class: title-slide

<style type="text/css">
:root { --main-logo: url(https://raw.githubusercontent.com/mcanouil/hex-stickers/master/SVG/mc.svg); }
.red { color: #b20000; }
.green { color: #00b200; }
.blue { color: #0000b2; }
</style>

<div class = "main-logo"></div><div class = "side-logo"></div>

# My Journey To Transparency And Reproducibility

## <i class = "fab fa-docker"></i> & <i class = "fab fa-r-project"></i>

### Mickaël Canouil, *Ph.D.* (<a href = "http://m.canouil.fr/" target = "_blank"><i class = "fas fa-home"></i> m.canouil.fr</a>)

### Inserm U1283 / CNRS UMR8199 / Institut Pasteur de Lille

### March 16, 2021

---
class: part-slide

# Who Am I?

---

# Who Am I? .font50[(_Definitely Not Iron Man ..._)]

.pull-left[
* __Head of the Biostatistics Team__ of the __Inserm U1283 / CNRS UMR 8199__ unit _(Functional (Epi)genomics and Molecular Physiology of Diabetes and Related Diseases)_.
  <img src = "data:image/png;base64,#https://raw.githubusercontent.com/mcanouil/hex-stickers/master/SVG/umr1283_8199.svg" width = "100px" />
* __Doctor of Philosophy (Ph.D.) in Biostatistics__.
* Authored __4 <i class = "fab fa-r-project"></i> packages__ on CRAN (more on GitHub).
  <img src = "data:image/png;base64,#https://raw.githubusercontent.com/mcanouil/hex-stickers/master/SVG/nacho.svg" width = "100px" /> <img src = "data:image/png;base64,#https://raw.githubusercontent.com/mcanouil/hex-stickers/master/SVG/insane.svg" width = "100px" /> <img src = "data:image/png;base64,#https://github.com/mcanouil/ggpacman/raw/master/man/figures/ggpacman.gif" width = "100px" />
* Contributed to __2 <i class = "fab fa-r-project"></i> packages__ on CRAN.
* .font80[_Watched 2,850 movies so far ..._]
]

.pull-right[
.center[
<img src = "data:image/png;base64,#https://github.com/mcanouil/ggpacman/raw/master/man/figures/README-pacman-1.gif" height = "450px" />

_From_ `ggpacman` _on CRAN._

_It's only an <i class = "fab fa-r-project"></i> package to make a GIF, sorry!_
]
]

---
class: part-slide

# Where Did My Journey Start?

---

# With A Project-oriented Workflow

---

# With A Project-oriented Workflow (_Or Not_)

<img src = "data:image/png;base64,#resources/project5.png" width = 35% style = "position:absolute; top: 12%; left: 45%; box-shadow: 3px 5px 3px 1px #ffffff80;" />

--

<img src = "data:image/png;base64,#resources/project2.png" width = 34% style = "position:absolute; top: 65%; left: 65%; box-shadow: 3px 5px 3px 1px #ffffff80;" />

--

<img src = "data:image/png;base64,#resources/project3.png" width = 50% style = "position:absolute; top: 13%; left: 5%; box-shadow: 3px 5px 3px 1px #ffffff80;" />
<img src = "data:image/png;base64,#https://media.giphy.com/media/116a8zosxwA0SI/giphy.gif" style = "position:absolute; top: 40%; left: 83%; box-shadow: 3px 5px 3px 1px #ffffff80;" />

--

<img src = "data:image/png;base64,#resources/project4.png" width = 30% style = "position:absolute; top: 72%; left: 10%; box-shadow: 3px 5px 3px 1px #ffffff80;" />

--

<img src = "data:image/png;base64,#resources/project6.png" width = 35% style = "position:absolute; top: 15%; left: 30%; box-shadow: 3px 5px 3px 1px #ffffff80;" />

--

<img src = "data:image/png;base64,#resources/project1.png" width = 45% style = "position:absolute; top: 15%; left: 47%; box-shadow: 3px 5px 3px 1px #ffffff80;" />

---
class: part-slide

# Project Structure?

--

--

.font200[→ I can fix it!]
---

# Something Simple

.pull-left.font150[
* `Data`
* `Docs`
* `Report`
* `Scripts`
* `README.md`
]

.pull-right[
<img src = "data:image/png;base64,#resources/project_good1.png" width = "43%" style = "position:absolute; top: 22%;" />
]

--

.center.font150.blue[
→ This solved one issue, but only the obvious one ...
<br/>
_i.e._, data, scripts, and outputs should not live in the same directory!
]

--

.pull-left.font150[
* `.git` → Let's add Git; it can't hurt!<sup>*</sup>
]

.pull-right[
<img src = "data:image/png;base64,#resources/project2_version.png" width = "100%"/>
]

--

.footnote[
<sup>*</sup> Git helped a little more than the project structure alone.
]

---
class: part-slide
background-image: url(data:image/png;base64,#https://source.unsplash.com/zFYUsLk_50Y)
background-size: cover

<style type="text/css">
.bg-img { text-shadow: 2px 2px 5px #000000 }
</style>

# .bg-img[Infrastructure System]

---

# Infrastructure System

<div style = "font-size: 1000%; color: #333; position: absolute; top: 35%; left: 42%;"> <i class = "fas fa-server"></i> </div>

--

<div style = "font-size: 500%; color: #333; position: absolute; top: 45%; left: 5%;"> <i class="fas fa-user" ></i><i class="fas fa-desktop"></i> </div>
<div style = "font-size: 400%; color: #333; position: absolute; top: 47%; left: 29%;"> <i class="fas fa-arrows-alt-h"></i> </div>

--

<div style = "font-size: 500%; color: #333; position: absolute; top: 45%; left: 78%;"> <i class="fas fa-user" ></i><i class="fas fa-desktop"></i> </div>
<div style = "font-size: 400%; color: #333; position: absolute; top: 47%; left: 65%;"> <i class="fas fa-arrows-alt-h"></i> </div>
<div style = "font-size: 500%; color: #333; position: absolute; top: 12%; left: 42%;"> <i class="fas fa-user" ></i><i class="fas fa-desktop"></i> </div>
<div style = "font-size: 400%; color: #333; position: absolute; top: 27.5%; left: 48%;"> <i class="fas fa-arrows-alt-v"></i> </div>
<div style = "font-size: 500%; color: #333; position: absolute; top: 79%; left: 42%;"> <i class="fas
fa-user" ></i><i class="fas fa-desktop"></i> </div>
<div style = "font-size: 400%; color: #333; position: absolute; top: 66.5%; left: 48%;"> <i class="fas fa-arrows-alt-v"></i> </div>

--

<div style = "font-size: 150%; color: #0000b2; position: absolute; top: 49%; left: 13.75%;"> <i class="fab fa-r-project"></i> </div>
<div style = "font-size: 150%; color: #b20000; position: absolute; top: 52%; left: 18%;"> <i class="fab fa-python"></i> </div>
<div style = "font-size: 150%; color: #0000b2; position: absolute; top: 49%; left: 86.75%;"> <i class="fab fa-r-project"></i> </div>
<div style = "font-size: 150%; color: #b20000; position: absolute; top: 52%; left: 91%;"> <i class="fab fa-python"></i> </div>
<div style = "font-size: 150%; color: #0000b2; position: absolute; top: 16.25%; left: 50.75%;"> <i class="fab fa-r-project"></i> </div>
<div style = "font-size: 150%; color: #b20000; position: absolute; top: 19.25%; left: 55%;"> <i class="fab fa-python"></i> </div>
<div style = "font-size: 150%; color: #0000b2; position: absolute; top: 83.25%; left: 50.75%;"> <i class="fab fa-r-project"></i> </div>
<div style = "font-size: 150%; color: #b20000; position: absolute; top: 86.25%; left: 55%;"> <i class="fab fa-python"></i> </div>

--

<div style = "font-size: 200%; color: #0000b2; position: absolute; top: 51%; left: 44%;"> <i class="fab fa-r-project" style = "color: #0000b2;"></i> <i class="fab fa-python" style = "color: #b20000;"></i> </div>
<div style = "font-size: 400%; color: #333; position: absolute; top: 45.45%; left: 14.5%;"> <i class="fas fa-times"></i> </div>
<div style = "font-size: 400%; color: #333; position: absolute; top: 45.45%; left: 87.5%;"> <i class="fas fa-times"></i> </div>
<div style = "font-size: 400%; color: #333; position: absolute; top: 12.6%; left: 51.5%;"> <i class="fas fa-times"></i> </div>
<div style = "font-size: 400%; color: #333; position: absolute; top: 79.6%; left: 51.5%;"> <i class="fas fa-times"></i> </div>

--

<div style = "font-size: 150%;
position: absolute; top: 20%; left: 8%;"> → Non-root users. </div>

--

<div style = "font-size: 150%; color: #b20000; position: absolute; top: 20%; left: 67%;"> → Not possible to install <i class = "fab fa-r-project"></i> packages or system libraries. </div>

--

<div style = "font-size: 150%; position: absolute; top: 75%; left: 8%;"> → Possible to install<br/> <i class = "fab fa-r-project"></i> packages in user `home`. </div>

--

<div style = "font-size: 150%; color: #b20000; position: absolute; top: 75%; left: 67%;"> → Limited storage in `home`. <br/> <i>Not a good place anyway ...</i> </div>

---

# Lack Of A Reproducible Environment

.font150[
.pull-left[
+ __Code errors__

  The same code can give different results on different platforms/machines.
]

.pull-right[
+ __Affecting multiple projects__

  In a global and shared environment, any change to system libraries or <i class = "fab fa-r-project"></i> packages can alter or break unrelated projects.
]

.pull-left[
+ __Difficult system deployment (for users/developers)__

  Establishing and maintaining infrastructure is challenging if not tracked properly, especially over time.
]

.pull-right[
+ __Painful collaboration__

  Teams and users will most likely waste time setting up a new environment rather than starting development.
]
]

---
class: part-slide

# Reproducible Development/Analysis Workflow<br/>With<br/>Docker<br/><i class = "fab fa-docker" style = "font-size: 400%;"></i>

---

# Build A Container With A Dockerfile

<img src = "data:image/png;base64,#resources/dockerfile_v1.svg"/>

--

<img src = "data:image/png;base64,#resources/umr1283_project.svg" style = "position:absolute; top: 12%; left: 53%; box-shadow: 3px 5px 3px 1px #ffffff80;" />
<img src = "data:image/png;base64,#https://raw.githubusercontent.com/mcanouil/hex-stickers/master/SVG/umr1283_8199.svg" width = "100px" style = "position:absolute; top: 35%; left: 75%; box-shadow: 3px 5px 3px 1px #ffffff80;" />

--

<style type="text/css">
.text-block { position: absolute; bottom: 5%; right: 5%; background-color: var(--bg-colour); color: var(--font-colour); float: right; width: 40%; box-shadow: 3px 5px 3px 1px #ffffff80; }
</style>

.text-block[
* More than __100 <i class = "fab fa-r-project"></i> packages__ pre-installed.
  → Increasing over time to ensure "old" projects still work.
* Still no way to install additional <i class = "fab fa-r-project"></i> packages without compromising the transparency/reproducibility provided by the Dockerfile.
]

---

# What Do We Have Now?

.pull-left.green[
* .font150[__Good__]
    + A __clear__ project structure.
    + __Flexibility at the system-level__ using a Dockerfile to build an image with <i class = "fab fa-r-project"></i> packages or any needed libraries.
    + __Reproducibility__, _i.e._, a project analysed using a specific Docker <i class = "fab fa-docker"></i> image can be re-analysed using that same Docker image.
]

.pull-right.red[
* .font150[__Bad__]
    + Requires some __knowledge__ about system administration and Docker <i class = "fab fa-docker"></i>.
    + New __<i class = "fab fa-r-project"></i> packages cannot be installed__ without having to build a new Docker <i class = "fab fa-docker"></i> image.
    + <i class = "fab fa-r-project"></i> packages are __not project-specific__, unless you create a Docker <i class = "fab fa-docker"></i> image for each project.
    + To build or run Docker, users are __required to be `root`__.
]

.center.blue[
.font150[
→ What if there were a way to __install any__ <i class = "fab fa-r-project"></i> __packages__ (_and Python <i class = "fab fa-python"></i> modules_), to __record versions__, and to automatically __restore/reinstall all the__ <i class = "fab fa-r-project"></i> __packages of a specific project__?
]
.font120[
All that without any "interference" with the system (_e.g._, Docker container, laptop, _etc._).
]
]

---
class: part-slide

# <img src = "data:image/png;base64,#https://raw.githubusercontent.com/rstudio/hex-stickers/master/SVG/renv.svg" width = "200px" />

---

<style type="text/css">
.bqm { border-left: solid 5px var(--font-colour); padding-left: 1em; }
.sign { float: right; width: 25%; }
</style>

# What Is `renv`?

.bqm.font120[
The `renv` package helps you create __r__eproducible __env__ironments for your <i class = "fab fa-r-project"></i> projects.
Use `renv` to make your R projects more:

* __Isolated__: Installing a new or updated package for one project won't break your other projects, and vice versa. That's because `renv` gives each project its own private package library.
* __Portable__: Easily transport your projects from one computer to another, even across different platforms. `renv` makes it easy to install the packages your project depends on.
* __Reproducible__: `renv` records the exact package versions you depend on, and ensures those exact versions are the ones that get installed wherever you go.

.sign[
[https://rstudio.github.io/renv/](https://rstudio.github.io/renv/)
]
]

---

# How Does `renv` Work?
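
A minimal sketch of the usual `renv` round trip, assuming the `renv` package is already installed (_e.g._, with `install.packages("renv")`) and that the R session was started in the project directory. `data.table` is only a placeholder for whatever package a project needs.

```r
# 1. Once per project: create a project-private library and a lockfile (renv.lock).
renv::init()

# 2. Install what the project needs; packages go to the project library,
#    not to the system or user library ("data.table" is a placeholder).
renv::install("data.table")

# 3. Record the versions of the packages the project uses in renv.lock.
renv::snapshot()

# 4. Later, on another machine or in a fresh container:
#    reinstall the versions recorded in renv.lock.
renv::restore()
```

Committing `renv.lock` to Git alongside the scripts is what lets collaborators call `renv::restore()` and get the same package versions.
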
---

# Docker <i class = "fab fa-docker"></i> & `renv`

<img src = "data:image/png;base64,#resources/dockerfile_v2.svg"/>

--

<img src = "data:image/png;base64,#resources/umr1283_project_renv.svg" style = "position:absolute; top: 12%; left: 64%; box-shadow: 3px 5px 3px 1px #ffffff80;" />
<img src = "data:image/png;base64,#https://raw.githubusercontent.com/mcanouil/hex-stickers/master/SVG/umr1283_8199.svg" width = "100px" style = "position:absolute; top: 32%; left: 83%; box-shadow: 3px 5px 3px 1px #ffffff80;" />

---

# What Do We Have Now?

.pull-left.green[
* .font150[__Good__]
    + A __clear__ project structure.
    + __Flexibility at the system-level__ using a Dockerfile to build an image with system libraries.
    + __Flexibility at the project-level__ using `renv`.
    + New __<i class = "fab fa-r-project"></i> packages can be installed/restored__ without having to build a new Docker <i class = "fab fa-docker"></i> image.
    + <i class = "fab fa-r-project"></i> packages are __project-specific__.
    + __Reproducibility__, _i.e._, Docker + `renv`. The underlying system, its dependencies, and the required <i class = "fab fa-r-project"></i> packages are fixed and constant for a particular project.
]

.pull-right.red[
* .font150[__Bad__]
    + Requires some __knowledge__ about system administration and Docker <i class = "fab fa-docker"></i>.
    + To build or run Docker, users are __required to be `root`__.
]

---

# How To Reduce Cognitive Load For New Users?

+ .font150[Remaining issues]
    + Having to learn about system administration (_e.g._, Debian, Ubuntu, macOS, Windows, _etc._) __can be a hassle__, especially for beginners.
    + Mostly for security reasons, __users should not be `root`__ for day-to-day work (or in general).

--

+ .font150[Solutions]
    + → Build a Docker image with RStudio Server, <i class = "fab fa-r-project"></i>, and "all" required system libraries (Docker containers as a _daemon_).
        + A user has __a non-root account__ to log in through a web browser to the IDE (_i.e._, RStudio).
        + A user __can develop and code an analysis__ in a shared environment with a __project-oriented workflow__ built around `renv`.
        + __No prior knowledge__ about Docker is required.
        + Scripts can still be launched directly in Docker without a _daemon_.
    + → [__Singularity__](https://sylabs.io/singularity/) to run containers __without root privileges__, including Docker containers.

---
class: part-slide

#

---
class: part-slide

# What Do We Have Now?

---
class: part-slide

# <img src = "data:image/png;base64,#https://avatars1.githubusercontent.com/u/8896044?s=460&v=4" height = "150px" id = "picture" />

.pull-left[
<a href = "" target = "_blank"><i class = "fas fa-phone"></i> +33 (0) 374 00 81 29</a>
]

.pull-right[
<a href = "mailto:mickael.canouil@cnrs.fr" target = "_blank"><i class = "fas fa-envelope"></i> mickael.canouil@cnrs.fr</a>
]

.center[
<a href = "http://m.canouil.fr" target = "_blank"><i class = "fas fa-home"></i> m.canouil.fr</a> <a href = "https://github.com/mcanouil/" target = "_blank"><i class = "fab fa-github"></i> mcanouil</a> <a href = "https://rlille.fr" target = "_blank"><i class = "fab fa-r-project"></i> rlille.fr</a>
]

.pull-left[
<a href = "https://www.linkedin.com/in/mickael-canouil/" target = "_blank"><i class = "fab fa-linkedin"></i> mickael-canouil</a>
]

.pull-right[
<a href = "https://twitter.com/mickaelcanouil/" target = "_blank"><i class = "fab fa-twitter"></i> @mickaelcanouil</a>
]